Day 14: Using YOLO to Detect Deformed Wheel Rims

2024 iThome Ironman Contest

DAY 14

Software Development

LSTM Combined with YOLO v8 for Multi-Zebrafish Behavior Analysis series, part 14

16th Ironman Contest

neilsu02

2024-08-16 22:03:46

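Today is Day 14. When we ride a road bike or any other bicycle, we may worry that a deformed wheel rim could become a safety problem, so we can write a YOLO program that detects whether a rim is deformed. The YOLO program is shown below.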

import cv2
import numpy as np

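def load_yolo_model(weight_path, cfg_path, names_path):
    # Load the YOLO network from its weight and config files
    net = cv2.dnn.readNet(weight_path, cfg_path)
    # Read the class names, one per line (line order defines the class ID)
    with open(names_path, "r") as f:
        classes = [line.strip() for line in f.readlines()]
    layer_names = net.getLayerNames()
    # getUnconnectedOutLayers() returns 1-based indices. The array is Nx1 in older
    # OpenCV builds and flat in newer ones, so flatten it before indexing.
    out_indices = np.array(net.getUnconnectedOutLayers()).flatten()
    output_layers = [layer_names[i - 1] for i in out_indices]
    return net, classes, output_layers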

def detect_deformation(image_path, net, output_layers, classes):
    # Load the image; cv2.imread returns None when the file cannot be read
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    height, width, channels = image.shape

    # YOLO preprocessing: scale pixels by about 1/255, resize to 416x416, swap BGR to RGB
    blob = cv2.dnn.blobFromImage(image, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Parse the YOLO output
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                # Object detected: convert the normalized box to pixels
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)

                # Top-left corner of the bounding box
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    # Apply non-maximum suppression to remove redundant boxes
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

    # Draw the results (OpenCV colors are BGR): green for "Normal", red for anything else
    for i in range(len(boxes)):
        if i in indexes:
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = confidences[i]
            color = (0, 255, 0) if label == "Normal" else (0, 0, 255)
            cv2.rectangle(image, (x, y), (x + w, y + h), color, 2)
            cv2.putText(image, f"{label} {confidence:.2f}", (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

    # Show the result
    cv2.imshow("Image", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

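# Example usage
# Note: the stock yolov3.weights / coco.names pair only knows the COCO classes.
# To label rims, use a model trained on your own classes (for example "Normal" and "Deformed").
weight_path = "yolov3.weights"  # replace with the path to your weights file
cfg_path = "yolov3.cfg"         # replace with the path to your config file
names_path = "coco.names"       # replace with the path to your class names file
image_path = "bicycle.jpg"      # replace with the path to the image you want to analyze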

net, classes, output_layers = load_yolo_model(weight_path, cfg_path, names_path)
detect_deformation(image_path, net, output_layers, classes)

1. Load the YOLO model

def load_yolo_model(weight_path, cfg_path, names_path):
    net = cv2.dnn.readNet(weight_path, cfg_path)  # load the YOLO model
    with open(names_path, 'r') as f:
        classes = [line.strip() for line in f.readlines()]  # read the class names
    layer_names = net.getLayerNames()
    # Flatten first: the index array is Nx1 in older OpenCV builds and flat in newer ones
    out_indices = np.array(net.getUnconnectedOutLayers()).flatten()
    output_layers = [layer_names[i - 1] for i in out_indices]  # select the output layers
    return net, classes, output_layers

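net = cv2.dnn.readNet(weight_path, cfg_path)  # load the YOLO model

cv2.dnn.readNet(weight_path, cfg_path): this line loads the YOLO model's "brain" and its "settings file". weight_path is the weights file, which stores what the model learned during training, while cfg_path is the configuration file, which describes the network architecture and how the data flows through it.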

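with open(names_path, 'r') as f:
    classes = [line.strip() for line in f.readlines()]  # read the class names

Read the class names: this code opens the names_path file, which lists the names of every object the model can recognize. Each line holds one class name, for example "Deformed" or "Normal", and the line order matches the class IDs the model outputs. This lets us turn a detected class ID into readable text when we display the result.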
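To make the line-to-class-ID mapping concrete, here is a minimal sketch. The helper name load_class_names and the two-class file contents are assumptions for illustration, not part of the original program.

```python
import os
import tempfile


def load_class_names(names_path):
    """Return the class names in file order, dropping trailing blank lines only."""
    with open(names_path, "r", encoding="utf-8") as f:
        names = [line.strip() for line in f]
    # Blank lines at the end would otherwise become empty class names
    while names and not names[-1]:
        names.pop()
    return names


# Usage with a throwaway names file (hypothetical classes)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "rim.names")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Normal\nDeformed\n\n")
    demo_classes = load_class_names(demo_path)

print(demo_classes[1])  # class ID 1 maps to the second line of the file
```

Blank lines in the middle of the file are kept on purpose, because removing them would shift every class ID that follows.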

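layer_names = net.getLayerNames()
# Flatten first: the index array is Nx1 in older OpenCV builds and flat in newer ones
out_indices = np.array(net.getUnconnectedOutLayers()).flatten()
output_layers = [layer_names[i - 1] for i in out_indices]  # select the output layers

Select the output layers: this part picks out the YOLO model's output layers. net.getLayerNames() returns the names of all layers in the model, and net.getUnconnectedOutLayers() returns the indices of the layers whose outputs are not connected to any other layer. These indices start at 1, which is why we subtract 1 before looking up the name. The outputs of these final layers are the model's detection results.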
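The index lookup can be tried without OpenCV. The sketch below is an illustration under assumptions: output_layer_names is a hypothetical helper, and the layer names are made up. It accepts both the nested index shape ([[2], [4]]) and the flat shape ([2, 4]).

```python
def output_layer_names(layer_names, out_indices):
    """Map 1-based layer indices to names, accepting nested or flat index lists."""
    names = []
    for idx in out_indices:
        try:
            idx = idx[0]  # nested form: each entry is a one-element sequence
        except (TypeError, IndexError):
            pass  # flat form: the entry is already a plain integer
        names.append(layer_names[int(idx) - 1])  # indices start at 1
    return names


layers = ["conv_0", "yolo_1", "conv_2", "yolo_3"]  # made-up layer names
flat = output_layer_names(layers, [2, 4])
nested = output_layer_names(layers, [[2], [4]])
print(flat, nested)
```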

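return net, classes, output_layers

Return the model and settings: finally, we return the model (net), the class names (classes), and the names of the output layers (output_layers), so that other parts of the program can use them for object detection.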

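2. Detect objects in the image

def detect_deformation(image_path, net, output_layers, classes):
    img = cv2.imread(image_path)  # read the image
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f"Could not read image: {image_path}")
    height, width, _ = img.shape

Read the image: cv2.imread(image_path) loads the image from the given path into the variable img. If the file is missing or unreadable it returns None instead of raising an error, so we check for that first. img.shape returns the image's height, width, and number of channels (the channel count is not needed here, so we discard it with _).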

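blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)  # preprocess the image
net.setInput(blob)
outs = net.forward(output_layers)  # run detection

Preprocess the image: cv2.dnn.blobFromImage converts the image into the format the model expects. It multiplies every pixel by 0.00392 (about 1/255, so values fall roughly between 0 and 1), resizes the image to 416x416, swaps the red and blue channels (the True argument, needed because OpenCV loads images as BGR), and packs the result into a "blob". setInput(blob) then feeds this preprocessed image into the model, ready for detection.
Run detection: net.forward(output_layers) runs the model on the image. outs is the model's output, which contains every candidate detection.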
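To see what the scale factor and the channel swap do to a single pixel, here is a small sketch in plain Python. It only mirrors two of the steps described above (scaling and the red/blue swap); it does not resize anything, and preprocess_pixel is a hypothetical name.

```python
SCALE = 0.00392  # the scale factor used above, roughly 1 / 255


def preprocess_pixel(bgr_pixel, scale=SCALE, swap_rb=True):
    """Scale one 8-bit BGR pixel to roughly [0, 1] and optionally reorder it to RGB."""
    b, g, r = (channel * scale for channel in bgr_pixel)
    return (r, g, b) if swap_rb else (b, g, r)


pure_red_bgr = (0, 0, 255)  # pure red as OpenCV stores it (blue, green, red)
scaled = preprocess_pixel(pure_red_bgr)
print(scaled)  # red now comes first, with a value just under 1.0
```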

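3. Process the detection results

class_ids = []
confidences = []
boxes = []

Initialize the lists: we use these three empty lists to store what the model detects: the class IDs (class_ids), the confidence scores (confidences), and the bounding boxes (boxes).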

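for out in outs:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.5:  # keep only detections with confidence above 50%
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])
            confidences.append(float(confidence))
            class_ids.append(class_id)

Parse the model output: this loop examines every detection. The first four values of each detection are the box center x, center y, width, and height, all normalized to the range 0 to 1; the fifth is the objectness score; and detection[5:] holds the score for each class. We use np.argmax(scores) to find the class with the highest score (the object type the model considers most likely), then multiply the normalized values by the image size to get the box position and size in pixels (x, y, w, h).
Filter out low-confidence results: we keep only results with a confidence above 50%, because these are more reliable. The computed bounding boxes are stored in the boxes list, the confidences list stores the confidence of each box, and the class_ids list stores the matching class IDs.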
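The arithmetic for one detection can be checked by hand. The following sketch decodes a single made-up detection vector for a 640x480 image with two hypothetical classes. decode_detection is an illustrative helper that follows the same steps as the loop above, using plain Python instead of NumPy.

```python
def decode_detection(detection, img_width, img_height, threshold=0.5):
    """Turn one YOLO detection vector into ([x, y, w, h], confidence, class_id) or None."""
    scores = detection[5:]  # per-class scores follow the 4 box values and objectness
    class_id = max(range(len(scores)), key=lambda k: scores[k])  # same role as np.argmax
    confidence = scores[class_id]
    if confidence <= threshold:
        return None  # too uncertain, discard
    center_x = int(detection[0] * img_width)
    center_y = int(detection[1] * img_height)
    w = int(detection[2] * img_width)
    h = int(detection[3] * img_height)
    x = int(center_x - w / 2)  # top-left corner
    y = int(center_y - h / 2)
    return [x, y, w, h], float(confidence), class_id


# Made-up values: box centered in the image, class 1 scores 0.8
strong = decode_detection([0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.8], 640, 480)
weak = decode_detection([0.5, 0.5, 0.25, 0.5, 0.9, 0.3, 0.2], 640, 480)
print(strong)  # ([240, 120, 160, 240], 0.8, 1)
print(weak)    # None
```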

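indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)  # filter out duplicate boxes

Filter out duplicate boxes: cv2.dnn.NMSBoxes uses non-maximum suppression (NMS) to discard overlapping boxes and keep the one with the highest confidence. The third argument (0.5) is the score threshold and the fourth (0.4) is the overlap threshold above which a lower-scoring box is suppressed. This prevents the same object from being reported with several overlapping boxes.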
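To show the idea behind NMS, here is a simplified greedy version in plain Python. It is a sketch for understanding only: simple_nms and iou are hypothetical helpers, and OpenCV's own implementation may differ in its details.

```python
def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def simple_nms(boxes, scores, score_threshold, nms_threshold):
    """Greedy NMS: visit boxes from highest score down, keep those that do not overlap a kept box."""
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


demo_boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]]
demo_scores = [0.9, 0.8, 0.7]
kept = simple_nms(demo_boxes, demo_scores, 0.5, 0.4)
print(kept)  # [0, 2]: box 1 overlaps box 0 too much and is dropped
```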

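4. Display the results

font = cv2.FONT_HERSHEY_PLAIN
for i in range(len(boxes)):
    if i in indexes:
        x, y, w, h = boxes[i]
        label = str(classes[class_ids[i]])
        color = (0, 255, 0) if label == "Normal" else (0, 0, 255)  # BGR: green or red
        cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)  # draw the box
        cv2.putText(img, label, (x, y - 10), font, 1, color, 2)  # add the label

Draw the detection results: this code draws the detected boxes on the image. If the detection is a "Normal" rim, the box is green; if it is a "Deformed" rim, the box is red. Note that OpenCV colors are given in BGR order, so (0, 0, 255) is red. cv2.rectangle draws the box, and cv2.putText writes the class name just above it.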
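One small detail: when a box touches the top of the image, y - 10 lands outside the picture and the label is not visible. The sketch below picks the color and a label position that stays inside the image. box_style and the 10-pixel margin are assumptions for illustration, and "Normal" is the hypothetical class name used above.

```python
GREEN = (0, 255, 0)  # BGR
RED = (0, 0, 255)    # BGR


def box_style(label, x, y, normal_label="Normal"):
    """Return (color, text_origin) for a box whose top-left corner is (x, y)."""
    color = GREEN if label == normal_label else RED
    # Put the text above the box when there is room, otherwise just inside it
    text_y = y - 10 if y - 10 >= 10 else y + 20
    return color, (x, text_y)


normal_style = box_style("Normal", 50, 100)
edge_style = box_style("Deformed", 50, 3)
print(normal_style)  # ((0, 255, 0), (50, 90))
print(edge_style)    # ((0, 0, 255), (50, 23))
```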

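cv2.imshow("Image", img)  # show the image
cv2.waitKey(0)
cv2.destroyAllWindows()

Show the image: finally, cv2.imshow displays the processed image, which now includes the detected rim status (deformed or normal). cv2.waitKey(0) keeps the window on screen until you press any key, and cv2.destroyAllWindows() then closes it.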

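5. Main program

# Note: the stock yolov3.weights / coco.names pair only knows the COCO classes.
# To label rims, use a model trained on your own classes (for example "Normal" and "Deformed").
weight_path = "yolov3.weights"  # YOLO weights file
cfg_path = "yolov3.cfg"  # YOLO config file
names_path = "coco.names"  # class names file
image_path = "bicycle.jpg"  # path of the image to check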

net, classes, output_layers = load_yolo_model(weight_path, cfg_path, names_path)
detect_deformation(image_path, net, output_layers, classes)
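Since the main program depends on four file paths, a quick check before loading the model can give a clearer message than a failure deep inside OpenCV. This is an optional sketch; missing_files is a hypothetical helper and the demo paths are temporary files created only for the example.

```python
import os
import tempfile


def missing_files(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Demo: one file that exists and one that does not
with tempfile.NamedTemporaryFile(suffix=".cfg") as existing:
    report = missing_files([existing.name, existing.name + ".does-not-exist"])

print(report)  # only the path that does not exist is listed
```

In the main program this could be called as missing_files([weight_path, cfg_path, names_path, image_path]) before load_yolo_model, stopping early if the returned list is not empty.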